Prospect-theoretic Q-learning

Abstract

We consider a prospect-theoretic version of the classical Q-learning algorithm for discounted reward Markov decision processes, wherein the controller perceives a distorted and noisy future reward, modeled by a nonlinearity that accentuates gains and under-represents losses relative to a reference point. We analyze the asymptotic behavior of the scheme by analyzing its limiting differential equation and using the theory of monotone dynamical systems to infer its asymptotic behavior. Specifically, we show convergence to equilibria, and establish some qualitative facts about the equilibria themselves.
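To make the idea in the abstract concrete, the following is a minimal sketch of tabular Q-learning in which the future reward passes through a distortion before entering the update. It is not the authors' exact scheme: the piecewise-linear `distort` function, its constants, the Gaussian noise, the uniform exploration, the 1/n step size, and the toy MDP are all illustrative assumptions, since the abstract does not specify them.

```python
import random


def distort(x, ref=0.0, gain_scale=1.5, loss_scale=0.5):
    """Hypothetical piecewise-linear distortion around a reference point.

    Deviations above `ref` (gains) are amplified and deviations below it
    (losses) are shrunk. The shape and constants are illustrative assumptions.
    """
    d = x - ref
    return ref + (gain_scale * d if d >= 0 else loss_scale * d)


def pt_q_learning(P, R, gamma=0.6, steps=5000, noise_std=0.1, seed=0):
    """Tabular Q-learning whose future reward is perceived through `distort`.

    P[s][a] is a list of next-state probabilities, R[s][a] a one-step reward.
    gamma * gain_scale < 1 is chosen here so the averaged update is a
    sup-norm contraction (an assumption of this sketch).
    """
    rng = random.Random(seed)
    n_states, n_actions = len(P), len(P[0])
    Q = [[0.0] * n_actions for _ in range(n_states)]
    visits = [[0] * n_actions for _ in range(n_states)]
    s = 0
    for _ in range(steps):
        a = rng.randrange(n_actions)  # uniform exploration (assumption)
        s_next = rng.choices(range(n_states), weights=P[s][a])[0]
        visits[s][a] += 1
        step = 1.0 / visits[s][a]  # decreasing step size per state-action pair
        # Future reward as perceived by the controller: noisy, then distorted.
        noisy_future = max(Q[s_next]) + rng.gauss(0.0, noise_std)
        target = R[s][a] + gamma * distort(noisy_future)
        Q[s][a] += step * (target - Q[s][a])
        s = s_next
    return Q


# Toy two-state, two-action MDP with made-up numbers.
P = [[[0.9, 0.1], [0.2, 0.8]], [[0.5, 0.5], [0.1, 0.9]]]
R = [[1.0, 0.0], [-1.0, 0.5]]
Q = pt_q_learning(P, R)
```

Setting `gain_scale = loss_scale = 1` and `noise_std = 0` reduces this sketch to the classical Q-learning update, which is one way to compare the distorted and undistorted iterates.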

Similar resources

Fragility of the Commons under Prospect-Theoretic Risk Attitudes

We study a common-pool resource game where the resource experiences failure with a probability that grows with the aggregate investment in the resource. To capture decision making under such uncertainty, we model each player’s risk preference according to the value function from prospect theory. We show the existence and uniqueness of a pure strategy Nash equilibrium when the players have arbit...

Non-Rational Discrete Choice Based On Q-Learning And The Prospect Theory

When modelling human discrete choice, the standard approach is to adopt the rational model. This has been shown, however, to fail systematically under some conditions, which makes evident the need for a better approach. The choice model is, however, only part of the problem because it does not say how to deal with uncertainty, where learning is necessary. In this regard, some evidence supports the...

Reactive Power Compensation Game under Prospect-Theoretic Framing Effects

Reactive power compensation is an important challenge in current and future smart power systems. However, in the context of reactive power compensation, most existing studies assume that customers can assess their compensation value, i.e., Var unit, objectively. In this paper, customers are assumed to make decisions that pertain to reactive power coordination. In consequence, the way in which t...

P14: Anxiety Control Using Q-Learning

Anxiety disorders are the most common reasons for referral to specialized clinics. If the response to stress is changed, anxiety can be greatly controlled. The most obvious effect of stress occurs on the circulatory system, especially through sweating. The electrical conductivity of skin, in other words the Galvanic Skin Response (GSR), which depends on stress level, is used; beside this parameter pe...

Evaluating project’s completion time with Q-learning

Nowadays project management is a key component in introductory operations management. The educators and the researchers in these areas advocate representing a project as a network and applying the solution approaches for network models to them to assist project managers to monitor their completion. In this paper, we evaluated project’s completion time utilizing the Q-learning algorithm. So the ...

Journal

Journal title: Systems & Control Letters

Year: 2021

ISSN: 1872-7956, 0167-6911

DOI: https://doi.org/10.1016/j.sysconle.2021.105009